D^3G: Novel Approaches to Data Statistics, Understanding and Preprocessing on the Grid

Authors

  • Alexander Wöhrer
  • Peter Brezany
  • Lenka Nováková
  • A Min Tjoa
Abstract

Relocating the code for data preprocessing (DPP) closer to the data source is the overall goal of the D^3G framework (Data Statistics, Data Understanding, Data Preprocessing on the Grid), developed in a joint project of the University of Vienna, the Vienna University of Technology and the Czech Technical University. This work presents the data-service-side architecture for gathering data statistics on the fly and using them in remote DPP methods applied to query results, as well as an approach for gathering exact continuous data statistics for whole tables in a database on the Grid. Performance results for our prototype implementation show low running costs for the continuous data statistics inside the database and demonstrate the feasibility of the proposed data-service-side functionality.
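As a rough illustration of gathering statistics on the fly while query results stream past, the sketch below uses Welford's single-pass algorithm and then reuses the collected minimum and maximum for a simple preprocessing step (min-max scaling). This is not the authors' implementation: the names `RunningStats`, `stream_with_stats` and `min_max_scale` are hypothetical, and a plain Python iterable stands in for a query result delivered by a Grid data service.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Single-pass statistics for one numeric column (hypothetical helper)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value: float) -> None:
        # Welford's update: numerically stable, no second pass over the data.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 when no values have been seen.
        return self.m2 / self.count if self.count else 0.0


def stream_with_stats(rows, stats):
    """Yield each value unchanged while updating the statistics as a side effect."""
    for value in rows:
        stats.update(value)
        yield value


def min_max_scale(value, stats):
    """Scale a value into [0, 1] using the gathered minimum and maximum."""
    span = stats.maximum - stats.minimum
    return 0.0 if span == 0 else (value - stats.minimum) / span


# Assumed stand-in for a streamed query result.
stats = RunningStats()
values = list(stream_with_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], stats))
scaled = [min_max_scale(v, stats) for v in values]
```

Because the statistics are updated as each row is delivered, the preprocessing step can run next to the data without a separate scan of the result set.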





Publication date: 2006